347 research outputs found

    A syntactic language model based on incremental CCG parsing

    Syntactically-enriched language models (parsers) constitute a promising component in applications such as machine translation and speech recognition. To maintain a useful level of accuracy, existing parsers are non-incremental and must span a combinatorially growing space of possible structures as every input word is processed. This prohibits their incorporation into standard linear-time decoders. In this paper, we present an incremental, linear-time dependency parser based on Combinatory Categorial Grammar (CCG) and classification techniques. We devise a deterministic transform of CCGbank canonical derivations into incremental ones, and train our parser on this data. We discover that a cascaded, incremental version provides an appealing balance between efficiency and accuracy.

    A syntactified direct translation model with linear-time decoding

    Recent syntactic extensions of statistical translation models work with a synchronous context-free or tree-substitution grammar extracted from an automatically parsed parallel corpus. The decoders accompanying these extensions typically exceed quadratic time complexity. This paper extends the Direct Translation Model 2 (DTM2) with syntax while maintaining linear-time decoding. We employ a linear-time parsing algorithm based on an eager, incremental interpretation of Combinatory Categorial Grammar (CCG). As every input word is processed, the local parsing decisions resolve ambiguity eagerly, by selecting a single supertag–operator pair for extending the dependency parse incrementally. Alongside translation features extracted from the derived parse tree, we explore syntactic features extracted from the incremental derivation process. Our empirical experiments show that our model significantly outperforms the state-of-the-art DTM2 system.

    Supertagged phrase-based statistical machine translation

    Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relative to a state-of-the-art PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task.

    MaTrEx: the DCU machine translation system for IWSLT 2007

    In this paper, we give a description of the machine translation system developed at DCU that was used for our second participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2007). In this participation, we focus on some new methods to improve system quality. Specifically, we try our word packing technique for different language pairs, we smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to address the high number of out-of-vocabulary items, and finally we deploy a translation-based model for case and punctuation restoration.

    An efficient numerical method for the modified regularized long wave equation using Fourier spectral method

    The modified regularized long wave (MRLW) equation is solved numerically using a Fourier spectral collection method. The MRLW equation is discretized in the space variable by the Fourier spectral method and in time by the leap-frog method. To validate the efficiency, accuracy and simplicity of the method, four case studies are solved: the motion of a single solitary wave, the interaction of two solitary waves, the interaction of three solitary waves, and a Maxwellian initial condition pulse. The L2 and L∞ error norms are computed for the motion of the single solitary wave. To determine the conservation properties of the MRLW equation, three invariants of motion are evaluated for all test problems.
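The scheme described in this abstract (Fourier spectral discretization in space, leap-frog in time) can be sketched as below. This is a minimal illustration and not the authors' code: the abstract does not state the equation, so the commonly cited form u_t + u_x + 6u²u_x − μu_xxt = 0 and its solitary-wave solution are assumed, and the grid size, domain length, time step, and the midpoint starter step are all hypothetical choices made for the example.

```python
import numpy as np

def soliton(x, t, c=1.0, mu=1.0, x0=40.0):
    """Solitary wave of the assumed MRLW form; it travels at speed c + 1."""
    p = np.sqrt(c / (mu * (c + 1.0)))
    return np.sqrt(c) / np.cosh(p * (x - (c + 1.0) * t - x0))

def rhs(u_hat, k, mu, n):
    """Fourier-space time derivative: (1 + mu k^2) u_hat_t = -i k (u_hat + 2 FFT(u^3))."""
    u = np.fft.irfft(u_hat, n=n)
    return -1j * k * (u_hat + 2.0 * np.fft.rfft(u**3)) / (1.0 + mu * k**2)

def solve(n=512, length=100.0, mu=1.0, dt=0.01, steps=500):
    """Advance the single-soliton initial condition on a periodic domain."""
    x = np.arange(n) * (length / n)
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=length / n)  # angular wavenumbers
    prev = np.fft.rfft(soliton(x, 0.0, mu=mu))
    # Leap-frog needs two time levels; a midpoint (RK2) step supplies the second.
    half = prev + 0.5 * dt * rhs(prev, k, mu, n)
    curr = prev + dt * rhs(half, k, mu, n)
    for _ in range(steps - 1):
        # Leap-frog update: u^{n+1} = u^{n-1} + 2 dt F(u^n)
        prev, curr = curr, prev + 2.0 * dt * rhs(curr, k, mu, n)
    return x, np.fft.irfft(curr, n=n), steps * dt

if __name__ == "__main__":
    x, u, t = solve()
    h = x[1] - x[0]
    print("L-infinity error:", np.max(np.abs(u - soliton(x, t))))
    print("I1 (integral of u):", h * np.sum(u))
```

Writing the nonlinear term as the derivative of 2u³ keeps the zero-wavenumber coefficient fixed, so the first invariant (the integral of u) is preserved by the time stepping up to rounding error.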

    Syntactic phrase-based statistical machine translation

    Phrase-based statistical machine translation (PBSMT) systems represent the dominant approach in MT today. However, unlike systems in other paradigms, it has proven difficult to date to incorporate syntactic knowledge in order to improve translation quality. This paper improves on recent research which uses 'syntactified' target language phrases, by incorporating supertags as constraints to better resolve parse tree fragments. In addition, we do not impose any sentence-length limit, and using a log-linear decoder, we outperform a state-of-the-art PBSMT system by over 1.3 BLEU points (or 3.51% relative) on the NIST 2003 Arabic-English test corpus.

    Improving Traffic Safety And Drivers' Behavior In Reduced Visibility Conditions

    This study is concerned with the safety risk of reduced visibility on roadways. Inclement weather events such as fog/smoke (FS), heavy rain (HR), and high winds affect every road by impacting pavement conditions, vehicle performance, visibility distance, and drivers’ behavior. Moreover, they affect travel demand, traffic safety, and traffic flow characteristics. Visibility in particular is critical to the task of driving, and a reduction in visibility due to FS or other weather events such as HR is a major factor that affects safety and proper traffic operation. A real-time measurement of visibility and an understanding of drivers’ responses when visibility falls below a certain acceptable level may help reduce the chances of visibility-related crashes. In this regard, one way to improve safety under reduced visibility conditions (i.e., reduce the risk of visibility-related crashes) is to improve drivers’ behavior under such adverse weather conditions. Therefore, one of the objectives of this research was to investigate the factors affecting drivers’ stated behavior in adverse visibility conditions, and to examine whether drivers rely on and follow advisory or warning messages displayed on portable changeable message signs (CMS) and/or variable speed limit (VSL) signs in different visibility and traffic conditions and on two types of roadways: freeways and two-lane roads. The data used for the analyses were obtained from a self-reported questionnaire survey carried out among 566 drivers in Central Florida, USA. Several categorical data analysis techniques, such as conditional distributions, odds ratios, and chi-square tests, were applied. In addition, two modeling approaches, bivariate and multivariate probit models, were estimated.
The results revealed that gender, age, road type, visibility condition, and familiarity with VSL signs were the significant factors affecting the likelihood of reducing speed following CMS/VSL instructions in reduced visibility conditions. Other objectives of this survey study were to determine the content of messages that would achieve the best perceived safety and drivers’ compliance and to examine the best way to improve safety during these adverse visibility conditions. The results indicated that "Caution-fog ahead-reduce speed" was the best message and that using CMS and VSL signs together was the best way to improve safety during such inclement weather situations. In addition, this research aimed to thoroughly examine drivers’ responses under low visibility conditions and to quantify the impacts and values of various factors found to be related to drivers’ compliance and drivers’ satisfaction with VSL and CMS instructions in different visibility and traffic conditions. To achieve these goals, Explanatory Factor Analysis (EFA) and Structural Equation Modeling (SEM) approaches were adopted. The results revealed that drivers’ satisfaction with VSL/CMS was the most significant factor that positively affected drivers’ compliance with advice or warning messages displayed on VSL/CMS signs under different fog conditions, followed by driver factors. Moreover, it was found that roadway type affected drivers’ compliance with VSL instructions under medium and heavy fog conditions. Furthermore, drivers’ familiarity with VSL signs and driver factors were the significant factors affecting drivers’ satisfaction with VSL/CMS advice under reduced visibility conditions. Based on the findings of the survey-based study, several recommendations are suggested as guidelines to improve drivers’ behavior in such reduced visibility conditions by enhancing drivers’ compliance with VSL/CMS instructions.
Underground loop detectors (LDs) are the most common freeway traffic surveillance technology used for various intelligent transportation system (ITS) applications such as travel time estimation and crash detection. Recently, the emphasis in freeway management has been shifting towards using LD data to develop real-time crash-risk assessment models. Numerous studies have established statistical links between freeway crash risk and traffic flow characteristics. However, there is a lack of good understanding of the relationship between traffic flow variables (i.e., speed, volume and occupancy) and crashes that occur under reduced visibility (VR crashes). Thus, another objective of this research was to explore the occurrence of VR crashes on freeways using real-time traffic surveillance data collected from LDs and radar sensors. In addition, it examines the difference between VR crashes and those occurring in clear visibility conditions (CV crashes). To achieve these objectives, Random Forests (RF) and matched case-control logistic regression models were estimated. The results indicated that the traffic flow variables leading to VR crashes are slightly different from those leading to CV crashes. It was found that higher occupancy observed about half a mile between the nearest upstream and downstream stations increases the risk of both VR and CV crashes. Moreover, an increase in the average speed observed on the same half a mile increases the probability of a VR crash. On the other hand, high speed variation coupled with lower average speed observed on the same half a mile increases the likelihood of CV crashes.
Moreover, two issues that have not explicitly been addressed in prior studies are (1) the possibility of predicting VR crashes using traffic data collected from the Automatic Vehicle Identification (AVI) sensors installed on Expressways and (2) which traffic data, LD or AVI, is advantageous for predicting VR crashes. Thus, this research attempts to examine the relationships between VR crash risk and real-time traffic data collected from LDs installed on two Freeways in Central Florida (I-4 and I-95) and from AVI sensors installed on two Expressways (SR 408 and SR 417). It also investigates which data is better for predicting VR crashes. The approach adopted here involves developing Bayesian matched case-control logistic regression models using the historical VR crashes, LD and AVI data. Regarding the models estimated from LD data, the average speed observed at the nearest downstream station along with the coefficient of variation in speed observed at the nearest upstream station, all at 5-10 minutes prior to the crash time, were found to have a significant effect on VR crash risk. However, for the model developed from AVI data, the coefficient of variation in speed observed at the crash segment, at 5-10 minutes prior to the crash time, affected the likelihood of VR crash occurrence. An argument concerning which traffic data (LD or AVI) is better for predicting VR crashes is also provided and discussed.
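The survey part of the abstract above mentions odds ratios and chi-square tests among its categorical analyses. The sketch below shows how those two quantities are computed for a single 2×2 table. It is only an illustration: the counts are hypothetical placeholders, not figures from the 566-driver survey, and the row and column labels are assumptions.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and its p-value, 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

if __name__ == "__main__":
    # Hypothetical counts: rows = familiar / unfamiliar with VSL signs,
    # columns = would reduce speed / would not reduce speed.
    table = (30, 20, 10, 40)
    print("odds ratio:", odds_ratio(*table))
    print("chi-square, p-value:", chi_square_2x2(*table))
```

An odds ratio above 1 in this layout would mean the first row group is more likely to report reducing speed; the chi-square test then indicates whether the association is stronger than chance would explain.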